A Sampling-Based Heuristic for Tree Search Applied to Grammar Induction
Authors
Abstract
In the fields of Operations Research and Artificial Intelligence, several stochastic search algorithms have been designed based on the theory of global random search (Zhigljavsky 1991). Basically, those techniques iteratively sample the search space with respect to a probability distribution which is updated according to the results of previous samples and some predefined strategy. Genetic Algorithms (GAs) (Goldberg 1989) and Greedy Randomized Adaptive Search Procedures (GRASP) (Feo & Resende 1995) are two particular instances of this paradigm. In this paper, we present SAGE, a search algorithm based on the same fundamental mechanisms as those techniques. However, it addresses a class of problems for which it is difficult to design transformation operators to perform local search because of intrinsic constraints in the definition of the problem itself. For those problems, a procedural approach is the natural way to construct solutions, resulting in a state space represented as a tree or a DAG. The aim of this paper is to describe the underlying heuristics used by SAGE to address problems belonging to that class. The performance of SAGE is analyzed on the problem of grammar induction, and its successful application to problems from the recent Abbadingo DFA learning competition is presented.
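The general mechanism described above (sample the search space from a probability distribution, then update that distribution from the results) can be illustrated with a small sketch. This is not SAGE itself, whose heuristics are not given in the abstract. It is a generic global random search over a construction tree, and the function names, the per-level distributions, the update rule, and all parameter values are assumptions chosen for illustration.

```python
import random


def sample_path(probs, rng):
    """Construct one solution procedurally: pick one branch per tree level."""
    return [rng.choices(range(len(p)), weights=p)[0] for p in probs]


def global_random_search(depth, branching, score,
                         iterations=200, batch=20, rate=0.3, seed=0):
    """Hypothetical sketch: sample root-to-leaf paths, then bias the
    per-level branch distributions toward the best path of each batch."""
    rng = random.Random(seed)
    # Assumption: one independent distribution per level, initially uniform.
    probs = [[1.0 / branching] * branching for _ in range(depth)]
    best, best_score = None, float("-inf")
    for _ in range(iterations):
        paths = [sample_path(probs, rng) for _ in range(batch)]
        top = max(paths, key=score)
        top_score = score(top)
        if top_score > best_score:
            best, best_score = top, top_score
        # Move each level's distribution toward the branch used by `top`;
        # the weights still sum to 1 after the update.
        for level, choice in enumerate(top):
            probs[level] = [
                (1 - rate) * p + (rate if i == choice else 0.0)
                for i, p in enumerate(probs[level])
            ]
    return best, best_score


if __name__ == "__main__":
    # Toy objective (an assumption): count positions that match a target path.
    target = [1, 0, 2, 1, 0, 2, 1, 0]
    matches = lambda path: sum(a == b for a, b in zip(path, target))
    print(global_random_search(len(target), 3, matches))
```

The sketch keeps one distribution per depth, which ignores dependencies between earlier and later choices. A method aimed at tree- or DAG-structured state spaces would likely need to condition its sampling on the partial solution built so far.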
Similar resources
A Stochastic Search Approach to Grammar Induction
This paper describes a new sampling-based heuristic for tree search named SAGE and presents an analysis of its performance on the problem of grammar induction. This last work has been inspired by the Abbadingo DFA learning competition [14] which took place between March and November 1997. SAGE ended up as one of the two winners in that competition. The second winning algorithm, first proposed by R...
A heuristic approach for multi-stage sequence-dependent group scheduling problems
We present several heuristic algorithms based on tabu search for solving the multi-stage sequence-dependent group scheduling (SDGS) problem by considering minimization of makespan as the criterion. As the problem is recognized to be strongly NP-hard, several meta (tabu) search-based solution algorithms are developed to efficiently solve industry-size problem instances. Also, two different initi...
Active Stratified Sampling with Clustering-Based Type Systems for Predicting the Search Tree Size of Problems with Real-Valued Heuristics
In this paper we advance the line of research launched by Knuth which was later improved by Chen for predicting the size of the search tree expanded by heuristic search algorithms such as IDA*. Chen’s Stratified Sampling (SS) uses a partition of the nodes in the search tree called type system to guide its sampling. Recent work has shown that SS using type systems based on integer-valued heurist...
A Gibbs Sampler for Phrasal Synchronous Grammar Induction
We present a phrasal synchronous grammar model of translational equivalence. Unlike previous approaches, we do not resort to heuristics or constraints from a word-alignment model, but instead directly induce a synchronous grammar from parallel sentence-aligned corpora. We use a hierarchical Bayesian prior to bias towards compact grammars with small translation units. Inference is performed usin...
A Bayesian Model of Syntax-Directed Tree to String Grammar Induction
Tree based translation models are a compelling means of integrating linguistic information into machine translation. Syntax can inform lexical selection and reordering choices and thereby improve translation quality. Research to date has focussed primarily on decoding with such models, but less on the difficult problem of inducing the bilingual grammar from data. We propose a generative Bayesia...